

## Introduction to Slide Set 15

- In the prior slide set we considered memory
   consistency or the order that memory accesses to
   multiple locations from one thread may become visible
   to another thread running on a different processor.
- In this slide we consider cache coherence or how an update to a single location on one core becomes visible to threads running on a different core.

# Learning Objectives

By the time we finish talking about this slide set in lectures you should be able to:

- Describe the snoop based approach to implementing shared memory.
- Evaluate which events are generated when loads and stores operate with a simple snoop cache coherence protocol for writeback caches.
- Describe how a directory based cache coherence protocol differs from a snooping cache coherence protocol
- Evaluate which events are generated when loads and stores operate with a simple directory cache coherence protocol



- Processors see different values for u after event 3
- With write back caches, value written back to memory depends on order of which cache writes back value first
- Unacceptable situation for programmers



- Processors see different values for u after event 3
- With write back caches, value written back to memory depends on order of which cache writes back value first
- Unacceptable situation for programmers



- Processors see different values for u after event 3
- With write back caches, value written back to memory depends on order of which cache writes back value first
- Unacceptable situation for programmers



- Processors see different values for u after event 3
- With write back caches, value written back to memory depends on order of which cache writes back value first
- Unacceptable situation for programmers



- Processors see different values for u after event 3
- With write back caches, value written back to memory depends on order of which cache writes back value first
- Unacceptable situation for programmers

## Recall: Cache Coherence Problem



- Processors see different values for u after event 3
- With write back caches, value written back to memory depends on order of which cache writes back value first
- Unacceptable situation for programmers

# Coherence Invariants

1. Single-Writer, Multiple-Reader (SWMR) Invariant



2. Data-Value Invariant. The value of the memory location at the start of an epoch is the same as the value of the memory location at the end of its last read-write epoch.

# **Coherence States**

- How to design system satisfying invariants?
- Track "state" of memory block copies and ensure states changes satisfy invariants.
- Typical states: "modified", "shared", "invalid".
- Mechanism for updating block state called a coherence protocol.



Cache can be accessed by CPU (1) or Bus (2).

We will consider effect of each starting with CPU.

# Implementing Cache Coherence



Requests received from both core and network:

### **Processor request:**

- receive load/store from core
- update state of block
- issue coherence request
- receive response
- update state of block
- reply to core

### **Network request:**

- receive coherence request
- update state of block
- issue coherence response

# Coherence, Simplified

 Initially, assume all actions triggered by a memory request take place atomically. Only consider "stable" coherence states.

 This view ignores time for messages to be sent and received.

 In practice add "transient" states to enable multiple memory requests to occur in parallel.

## Coherence Protocol States



Direct mapped, write-back, write allocate



| Valid | Dirty | State    |
|-------|-------|----------|
| 0     | X     | Invalid  |
| 1     | 0     | Shared   |
| 1     | 1     | Modified |

- Logically, each memory block has a state.
- Typically 2 to 5 states for each block.

# Directory vs. Snoop Protocols

- Directory Coherence Protocol
  - Summarize state of block in all caches in single location called a directory.
  - Advantages: Scalability
  - Disadvantage: Latency, Complexity
- Snoop Coherence
  - For each action ask all caches for state of block
  - Advantages: Fast and simple
  - Disadvantage: Not scalable
- Stable states at L1 cache same for both

# **Snooping** Cache-Coherence Protocols



- Cache Controller "snoops" all transactions on the shared medium (bus or switch)
  - Relevant transaction if for a block it contains
  - Take action to ensure coherence
    - invalidate or supply value
  - Action depends on state of the block and the protocol
- Get exclusive access before write via write invalidate

## MSI Snoop Protocol

- Each <u>memory</u> block is in one state (state is implicit)
  - Clean in all caches and up-to-date in memory (<u>Shared</u>)
  - OR Dirty in exactly one cache (<u>Modified</u>)
  - OR Not in any caches
- Each <u>cache</u> block is in one state (state is explicit):
  - Shared : block can be read
  - OR Modified: cache has only copy, its writeable, and dirty
  - OR <u>Invalid</u>: block contains stale data or never accessed
- Read misses: cause all caches to snoop bus in case they have a modified copy of data (memory may be out of date)
- Write hits to "shared" blocks: treated as misses since we need to tell other processor caches about the write so they can invalidate any shared copy of the data.

Writes need to know whether any other copies of the block is cached

#### Write hit:

- Block in modified state ⇒ No other copies ⇒ No need to place invalidate on bus for writeback cache
- Block in shared state ⇒ may be copies ⇒ Need to place invalidate on bus (will cause other processors to invalidate their copy)

#### Write miss:

 Block in invalid state ⇒ write miss ⇒ Need to read block from memory (as in uniprocessor case) or another processor (if modified state in that other processor's cache).

- State machine for <u>CPU</u> requests for each <u>cache block</u>
- · Non-resident blocks invalid





# Cache Block State



- State machine for <u>CPU</u> requests for each cache block
- Non-resident blocks invalid





# Cache Block State



Starting from "Invalid" which state is next after CPU read?

A: Invalid

B: Shared

C: Modified

- State machine for <u>CPU</u> requests for each cache block
- · Non-resident blocks invalid



# Cache Block State



Starting from "Invalid" which state is next after CPU read?

A: Invalid

B: Shared ✓

C: Modified

- State machine for <u>CPU</u> requests for each cache block
- · Non-resident blocks invalid



# Cache Block State



Starting from "Invalid" which state is next after CPU write?

A: Invalid

B: Shared

C: Modified











 State machine for <u>CPU</u> requests for each <u>cache block</u>





# Cache Block State



 State machine for <u>CPU</u> requests for each <u>cache block</u>





Starting from "Modified" which state do we end up in after a CPU write miss?

A: Invalid

B: Shared

C: Modified

D: Not sure



 State machine for <u>CPU</u> requests for each <u>cache block</u>





Starting from "Modified" which state do we end up in after a CPU write miss?

A: Invalid

B: Shared

C: Modified ✓

D: Not sure



 State machine for <u>CPU</u> requests for each <u>cache block</u>





Starting from "Shared" which state do we end up in after a CPU read miss?

A: Invalid

**B**: Shared

C: Modified

D: Not sure



 State machine for <u>CPU</u> requests for each <u>cache block</u>





CPU Read miss
Place read miss
on bus

Starting from "Shared" which state do we end up in after a CPU read miss?

A: Invalid

B: Shared ✓

C: Exclusive

D: Not sure



 State machine for <u>CPU</u> requests for each <u>cache block</u>





CPU Read miss
Place read miss
on bus

# Cache Block State



 State machine for <u>CPU</u> requests for each <u>cache block</u>





CPU Read miss
Place read miss
on bus

Starting from "Modified" which state do we end up in after a CPU read miss?

A: Invalid

**B**: Shared

C: Modified

D: Not sure



 State machine for <u>CPU</u> requests for each <u>cache block</u>

Starting from "Modified" which state do we end up in after a CPU read miss?

ck

A: Invalid

B: Shared ✓

C: Modified



### Block-replacement

 State machine for <u>CPU</u> requests for each <u>cache block</u>



Cache Block State

 State machine for <u>bus</u> requests for each <u>cache block</u>









State machine for <u>bus</u> requests

Starting from "Modified" which state do we end up in after a bus read miss?

A: Invalid

B: Shared

C: Modified









State machine for <u>bus</u> requests

Starting from "Modified" which state do we end up in after a bus read miss?

A: Invalid

B: Shared ✓

C: Modified





 State machine for <u>bus</u> requests for each <u>cache block</u>





State machine for <u>bus</u> requests

Starting from "Shared" which state do we end up in after a bus write miss?

A: Invalid

B: Shared

C: Modified





 State machine for <u>bus</u> requests

Starting from "Shared" which state do we end up in after a bus write miss?

A: Invalid ✓

B: Shared

C: Modified





 State machine for <u>bus</u> requests for each <u>cache block</u>





State machine for <u>bus</u> requests

Starting from "Modified" which state do we end up in after a bus write miss?

A: Invalid

**B**: Shared

C: Modified





 State machine for **bus** requests

Starting from "Modified" which state do we end up in after a bus write miss?

A: Invalid ✓

B: Shared

C: Modified

D: Not sure



**Read miss** 

Write miss

for this block

for this block Write Back Block; (abort

memory access)

Shared

(read/only)



#### Write-back State Machine-III













| P2             |   |     |    |    |  |  |
|----------------|---|-----|----|----|--|--|
| State Tag Data |   |     |    |    |  |  |
| В0             | M | 120 | 01 | 31 |  |  |
| В1             | M | 128 | 54 | 40 |  |  |
| B2             | S | 110 | 00 | 01 |  |  |
| В3             | I | 118 | 80 | 00 |  |  |

| 3     |             |                                                                                           |                                                                                                                                         |
|-------|-------------|-------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------|
| State | Tag         | Da                                                                                        | ata                                                                                                                                     |
| I     | 100         | 00                                                                                        | 00                                                                                                                                      |
| S     | 108         | 42                                                                                        | 00                                                                                                                                      |
| I     | 110         | DF                                                                                        | FF                                                                                                                                      |
| M     | 138         | 00                                                                                        | 28                                                                                                                                      |
|       | State I S I | State         Tag           I         100           S         108           I         110 | State         Tag         Date           I         100         00           S         108         42           I         110         DF |

| P4 | Ļ     |     |    |     |
|----|-------|-----|----|-----|
|    | State | Tag | Da | ata |
| В0 | I     | 100 | 00 | 00  |
| B1 | S     | 108 | 42 | 00  |
| B2 | M     | 130 | 00 | 02  |
| В3 | I     | 118 | 00 | 00  |

- Bus-ba

- Tag cor

What changes following the operation:

P4: read 118

- Writeba A: P4.B3: (S, 118, 80 00)
- Direct-r B: P4.B3: (S, 118, 80 40)
- 2 words P1.B3: (S, 118, 80 40)
- by dotte C: P4.B3: (S, 118, 80 40)
- right). P1.B3: (S, 118, 80 40)
  - M.118: (80 40)
  - D: P4.B3: (S, 118, 80 40)
    - P1.B3: (I, 118, 80 40)
    - M.118: (80 40)



| P2             |                 |                                                                                           |                                                                                                                                         |  |  |  |
|----------------|-----------------|-------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------|--|--|--|
| State Tag Data |                 |                                                                                           |                                                                                                                                         |  |  |  |
| M              | 120             | 01                                                                                        | 31                                                                                                                                      |  |  |  |
| M              | 128             | 54                                                                                        | 40                                                                                                                                      |  |  |  |
| S              | 110             | 00                                                                                        | 01                                                                                                                                      |  |  |  |
| I              | 118             | 80                                                                                        | 00                                                                                                                                      |  |  |  |
|                | State<br>M<br>M | State         Tag           M         120           M         128           S         110 | State         Tag         Date           M         120         01           M         128         54           S         110         00 |  |  |  |

| P3 | 3     |     |    |     |
|----|-------|-----|----|-----|
|    | State | Tag | Da | ata |
| В0 | Ι     | 100 | 00 | 00  |
| В1 | S     | 108 | 42 | 00  |
| B2 | I     | 110 | DF | FF  |
| B3 | M     | 138 | 00 | 28  |

| P4 | 1     |     |    |     |
|----|-------|-----|----|-----|
|    | State | Tag | Da | ata |
| В0 | I     | 100 | 00 | 00  |
| В1 | S     | 108 | 42 | 00  |
| B2 | M     | 130 | 00 | 02  |
| В3 | I     | 118 | 00 | 00  |



- Bus-bas

P4: read 118

What changes following the operation:

Writeba A: P4.B3: (S, 118, 80 00)

Direct-r B: P4.B3: (S, 118, 80 40)

2 words P1.B3: (S, 118, 80 40)

by dotte C: P4.B3: (S, 118, 80 40)

right). P1.B3: (S, 118, 80 40)

Tag cor M.118: (80 40) ✓

D: P4.B3: (S, 118, 80 40)

P1.B3: (I, 118, 80 40)

M.118: (80 40)





| P2             |   |     |    |    |  |  |
|----------------|---|-----|----|----|--|--|
| State Tag Data |   |     |    |    |  |  |
| В0             | M | 120 | 01 | 31 |  |  |
| В1             | M | 128 | 54 | 40 |  |  |
| В2             | S | 110 | 00 | 01 |  |  |
| ВЗ             | I | 118 | 80 | 00 |  |  |
| В3             | I | 118 | 80 | 00 |  |  |

| 3     |             |                                                                                           |                                                                                                                                         |
|-------|-------------|-------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------|
| State | Tag         | Da                                                                                        | ata                                                                                                                                     |
| Ι     | 100         | 00                                                                                        | 00                                                                                                                                      |
| S     | 108         | 42                                                                                        | 00                                                                                                                                      |
| Ι     | 110         | DF                                                                                        | FF                                                                                                                                      |
| M     | 138         | 00                                                                                        | 28                                                                                                                                      |
|       | State I S I | State         Tag           I         100           S         108           I         110 | State         Tag         Date           I         100         00           S         108         42           I         110         DF |

| P4 | ļ     |     |    |     |
|----|-------|-----|----|-----|
|    | State | Tag | Da | ata |
| В0 | I     | 100 | 00 | 00  |
| B1 | S     | 108 | 42 | 00  |
| B2 | M     | 130 | 00 | 02  |
| В3 | I     | 118 | 00 | 00  |

Address

Data

- Bus-ba

P1: read 138

What changes following the operation:

Writeba A: P1.B3: (S, 138, 80 00)

Direct-r B: P1.B3: (M, 138, 80 28)

2 words C: P1.B3: (S, 138, 00 28)

by dotte P3.B3: (S, 138, 00 28)

right). M.138: (00 28)

Tag cor D: P1.B3: (S, 138, 00 28)

P3.B3: (S, 138, 00 28)

M.118: (80 40)

M.138: (00 28)



| P2             |                 |                                                                                           |                                                                                                                                         |  |  |  |
|----------------|-----------------|-------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------|--|--|--|
| State Tag Data |                 |                                                                                           |                                                                                                                                         |  |  |  |
| M              | 120             | 01                                                                                        | 31                                                                                                                                      |  |  |  |
| M              | 128             | 54                                                                                        | 40                                                                                                                                      |  |  |  |
| S              | 110             | 00                                                                                        | 01                                                                                                                                      |  |  |  |
| I              | 118             | 80                                                                                        | 00                                                                                                                                      |  |  |  |
|                | State<br>M<br>M | State         Tag           M         120           M         128           S         110 | State         Tag         Date           M         120         01           M         128         54           S         110         00 |  |  |  |

| P3    |             |                                                                                           |                                                                                                                                         |  |  |  |
|-------|-------------|-------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------|--|--|--|
| State | Da          | ata                                                                                       |                                                                                                                                         |  |  |  |
| I     | 100         | 00                                                                                        | 00                                                                                                                                      |  |  |  |
| S     | 108         | 42                                                                                        | 00                                                                                                                                      |  |  |  |
| Ι     | 110         | DF                                                                                        | FF                                                                                                                                      |  |  |  |
| M     | 138         | 00                                                                                        | 28                                                                                                                                      |  |  |  |
|       | State I S I | State         Tag           I         100           S         108           I         110 | State         Tag         Date           I         100         00           S         108         42           I         110         DF |  |  |  |

| P4 | Ļ     |     |    |     |
|----|-------|-----|----|-----|
|    | State | Tag | Da | ata |
| В0 | I     | 100 | 00 | 00  |
| B1 | S     | 108 | 42 | 00  |
| B2 | M     | 130 | 00 | 02  |
| В3 | I     | 118 | 00 | 00  |



- Bus-bas

### What changes following the operation:

P1: read 138

Writeba A: P1.B3: (S, 138, 80 00)

Direct-r B: P1.B3: (M, 138, 80 28)

2 words C: P1.B3: (S, 138, 00 28)

by dotte P3.B3: (S, 138, 00 28)

right). M.138: (00 28)

Tag cor D: P1.B3: (S, 138, 00 28)

P3.B3: (S, 138, 00 28)

M.118: (80 40)

M.138: (00 28) ✓

